ABSTRACT

Background: Ishak and METAVIR scoring systems are among the most commonly used histopathological 
systems to evaluate chronic hepatitis. 

Objective: To assess the level of agreement between these two scoring systems in patients with chronic 
hepatitis B. 

Methods: Liver biopsy samples taken from 92 patients with chronic hepatitis B were used as the
training set; 57 more biopsy specimens were used as the validation set. In the training set, the grade of
necro-inflammation and stage of fibrosis for each liver biopsy specimen were determined by two expert
liver pathologists using both the Ishak and METAVIR systems. Inter-observer variability between the two
pathologists was evaluated. Biopsy specimens of the validation set were examined and scored by a third
expert pathologist. In the training set, criteria were developed to categorize the Ishak grading and
staging systems separately to best fit the METAVIR scoring system. The criteria derived from the
training set were then tested in the validation set. The level of agreement between the two scoring
systems was assessed by weighted kappa statistics.

Results: For the training set, agreement between the two pathologists was excellent. Using our proposed
criteria in the training set, there was an excellent level of agreement in grading (k = 0.89) and staging
(k = 0.99) between the Ishak and METAVIR systems. In the validation set, the criteria led to substantial
correlation (k = 0.61) in grading, and excellent correlation (k = 0.94) in staging between the two systems.

Conclusion: Using our proposed criteria, excellent or at least substantial concordance between Ishak and 
METAVIR scoring systems can be achieved for the degree of both necro-inflammatory changes and fibro- 
sis. 

KEYWORDS: Hepatitis B; Chronic hepatitis; Staging; Ishak; METAVIR 



*Correspondence: Reza Malekzadeh, MD, Digestive Disease 
Research Center, Shariati Hospital, North Kargar Ave, 
Tehran 14117, Iran 

Tel: +98-21-8241-5300 

Fax: +98-21-8241-5400 

E-mail: malek@ams.ac.ir 



INTRODUCTION 

Liver biopsy is considered the gold standard for assessing the grade
of liver injury and stage of liver fibrosis in patients with chronic
hepatitis. In an attempt to standardize assessment of liver histology
by pathologists, several scoring systems have been developed. Among
these, the modified Histology Activity Index (HAI) developed by
Ishak, et al [1], and the METAVIR system [2,3] are used most widely.
While a large number of researchers use the Ishak system to assess
liver histology in chronic hepatitis studies [4,5], other researchers,
mostly from Europe, prefer the METAVIR system [6].

Each of these scoring systems provides reliable scores, with
relatively little intra- and inter-observer variation [3,7]. In a
recent study, good concordance between the Ishak and METAVIR systems
was reported [8], though variation was greater for necro-inflammatory
features than for fibrosis and cirrhosis.

It is, however, unclear whether a given score in the Ishak system
predictably corresponds to a certain score in the METAVIR system.
Concordance of the two systems in the grading of necro-inflammatory
changes is more problematic. It is not known if individual components
of the grading scores in the Ishak system (e.g., interface hepatitis,
confluent necrosis, etc) contribute to this correlation. We, therefore,
attempted to identify criteria in the Ishak system that correspond to
the METAVIR score.



MATERIALS AND METHODS 

One hundred and sixty-eight consecutive liver biopsies from
treatment-naive chronic hepatitis B virus (HBV) carriers sent to the
Department of Pathology of our center between 2004 and 2005 were
prospectively evaluated. All patients were chronic carriers of HBV,
documented by two positive HBsAg tests at least six months apart.
Informed written consent for the study was given by each patient
prior to liver biopsy.

Biopsy samples received between January 2004 and March 2005 were
considered the training set, and the samples received between April
2005 and December 2005 were considered the validation set.

Nineteen of the 168 samples were excluded because of inadequate
specimen size (e.g., fewer than four portal tracts). Thus, 92
specimens were included in the training set, and 57 in the validation
set. All specimens were fixed in 10% formalin and embedded in
paraffin. For each case, three sections were stained with
hematoxylin-eosin, Masson trichrome, and reticulin. All biopsy
specimens in the training group were examined by two pathologists
expert in liver pathology, each with at least 10 years of practice in
liver pathology in an academic center. In the training set, all slides
were reviewed by each pathologist and scored by the Ishak system [1].
Subsequently, all specimens were scored by the METAVIR system [2,3].
Each pathologist worked independently and was blinded to the readings
of the other pathologist and to the readings by the other system.
Inter-observer agreement was evaluated by kappa statistics. Then, all
discordances between the two pathologists were resolved by agreement
in joint sessions.

In order to compare the two scoring systems, we tried to equalize the
number of categories in the two systems. Grading of necro-inflammation
has four components in the Ishak system: "a", interface hepatitis;
"b", confluent necrosis; "c", focal lytic necrosis; and "d", portal
inflammation [1]. Grading in the METAVIR system is simply classified
as A0 to A3 based on the severity of the necro-inflammation [3].

We pooled the grading scores of the Ishak system into four groups
(i.e., minimal, mild, moderate, and severe necro-inflammation). We
adjusted the individual components of the Ishak grading system in each
group to find the categorization that best fits the METAVIR grading
system (Table 1). These groups were compared with the four groups of
the METAVIR grading system by kappa statistics.
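As a minimal sketch, the Table 1 criteria can be written as a small decision function. The function and variable names below are hypothetical. The score ranges that are checked (a: 0-4, b: 0-6, c: 0-4, d: 0-4) are the usual ranges of the Ishak modified HAI and are an assumption here, not something stated in this study.

```python
# Illustrative sketch of the Table 1 criteria; all names are hypothetical.

# Assumed valid ranges for the four Ishak grading components.
ISHAK_RANGES = {"a": (0, 4), "b": (0, 6), "c": (0, 4), "d": (0, 4)}

CATEGORY_TO_METAVIR = {
    "minimal": "A0",
    "mild": "A1",
    "moderate": "A2",
    "severe": "A3",
}


def ishak_grade_category(a, b, c, d):
    """Return the pooled necro-inflammation category for Ishak scores a-d."""
    for name, value in (("a", a), ("b", b), ("c", c), ("d", d)):
        low, high = ISHAK_RANGES[name]
        if not low <= value <= high:
            raise ValueError(f"Ishak component {name}={value} outside {low}-{high}")

    if a == 0 and b == 0:
        # No interface hepatitis or confluent necrosis:
        # minimal if c and d are both 0-1, otherwise mild.
        return "minimal" if c <= 1 and d <= 1 else "mild"
    if a == 0:
        # a0 with confluent necrosis (b >= 1), any c, any d.
        return "mild"
    if a <= 2:
        # a1-2: mild without confluent necrosis, moderate with it.
        return "mild" if b == 0 else "moderate"
    # a3-4: moderate without confluent necrosis, severe with it.
    return "moderate" if b == 0 else "severe"


def ishak_to_metavir_grade(a, b, c, d):
    """Map Ishak component scores to the corresponding METAVIR activity grade."""
    return CATEGORY_TO_METAVIR[ishak_grade_category(a, b, c, d)]


print(ishak_to_metavir_grade(0, 0, 1, 1))  # A0
print(ishak_to_metavir_grade(2, 1, 0, 3))  # A2
```

Note that components "c" and "d" only influence the result when both "a" and "b" are zero, which mirrors the "any c, any d" entries in Table 1.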

Additionally, we compared grading of METAVIR with the previously
mentioned categories of the Ishak system, i.e., minimal
necro-inflammation (total grades of 1-3), mild (total grades of 4-8),
moderate (total grades of 9-12) or severe (total grades of 13-18) [8].
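For contrast, the total-grade categorization described in the preceding paragraph can be sketched as a simple threshold function. The function name is hypothetical, and the handling of a total of 0 is an assumption, since the text lists cut-offs only for totals of 1-18.

```python
def ishak_total_category(total):
    """Categorize a total Ishak grade (sum of components a-d) by the earlier cut-offs."""
    if not 0 <= total <= 18:
        raise ValueError("total Ishak grade must be between 0 and 18")
    if total == 0:
        # Assumption: the text assigns no category to a total of 0.
        return None
    if total <= 3:
        return "minimal"
    if total <= 8:
        return "mild"
    if total <= 12:
        return "moderate"
    return "severe"


# Scores a3, b0, c1, d1 sum to 5.
print(ishak_total_category(3 + 0 + 1 + 1))  # mild
```

The example shows how the two approaches can diverge: a total of 5 is "mild" under these cut-offs, whereas Table 1 places a3-4 with b0 in the moderate group.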

Table 1: Proposed criteria for comparison between Ishak and METAVIR scoring systems

Grading of necro-inflammation in the Ishak system*    Grading of necro-inflammation
                                                      in the METAVIR system

Minimal     a0, b0, c0-1, d0-1                        A0

Mild†       a0, b0, c ≥ 2 and/or d ≥ 2                A1
            a1-2, b0, any c, any d
            a0, b ≥ 1, any c, any d

Moderate    a1-2, b ≥ 1, any c, any d                 A2
            a3-4, b0, any c, any d

Severe      a3-4, b ≥ 1, any c, any d                 A3

*In the Ishak system, letter "a" denotes interface hepatitis (piecemeal necrosis); "b" confluent
necrosis; "c" focal lytic necrosis; and "d" portal inflammation.

†Any of these three different conditions is considered mild necro-inflammation.

The Ishak system scores fibrosis into seven categories (0-6), while
the METAVIR system scores liver fibrosis into five groups (F0-F4). We
modified the Ishak fibrosis scoring system by reducing the seven
categories to five, and found the categorization that best fits the
five groups of the METAVIR system in the training set (Table 2).

Then, the 57 biopsy specimens of the validation set were evaluated by
a third expert pathologist. He first scored all the slides by the
Ishak system. He then scored the slides by the METAVIR system. The
categorizations of Ishak grading (Table 1) and Ishak staging (Table 2)
obtained from the training set were applied to the validation set.

The same statistical analysis was done to compare the Ishak and
METAVIR systems in the validation group.



Table 2: Proposed criteria for comparison
between Ishak and METAVIR scoring systems

Staging of liver fibrosis

Ishak system    METAVIR system
0               F0
1 or 2          F1
3               F2
4 or 5          F3
6               F4
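The staging correspondence in Table 2 amounts to a fixed lookup. A minimal sketch follows; the dictionary and function names are hypothetical.

```python
# Ishak fibrosis stage (0-6) -> METAVIR fibrosis stage, as listed in Table 2.
ISHAK_TO_METAVIR_STAGE = {
    0: "F0",
    1: "F1",
    2: "F1",
    3: "F2",
    4: "F3",
    5: "F3",
    6: "F4",
}


def ishak_to_metavir_stage(stage):
    """Return the METAVIR stage that Table 2 pairs with an Ishak stage."""
    try:
        return ISHAK_TO_METAVIR_STAGE[stage]
    except KeyError:
        raise ValueError(f"Ishak stage must be an integer from 0 to 6, got {stage!r}") from None


print([ishak_to_metavir_stage(s) for s in range(7)])
# ['F0', 'F1', 'F1', 'F2', 'F3', 'F3', 'F4']
```

The mapping is many-to-one, so a METAVIR stage of F1 or F3 cannot be converted back to a single Ishak stage.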

Statistical analysis 

Weighted kappa statistics were used to determine the level of
agreement in each analysis.

A k < 0.2 was considered "slight" concordance; 0.2-0.39 "fair";
0.4-0.59 "moderate"; 0.6-0.79 "substantial"; and k ≥ 0.8 was
considered an "excellent" or "almost perfect" level of concordance.
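The following is a minimal sketch of how kappa can be computed from a square table of counts and then labelled with the bands given above. The text does not state which weighting scheme was used, so the "linear" and "quadratic" options are assumptions, as are all function names. Treating a value of exactly 0.8 as "excellent" is also an assumption.

```python
def cohen_kappa(table, weights=None):
    """Cohen's kappa for a square table of counts (rows: rating 1, columns: rating 2).

    weights=None gives unweighted kappa; "linear" or "quadratic" penalize a
    disagreement by how far apart the two ordered categories are.
    """
    if weights not in (None, "linear", "quadratic"):
        raise ValueError("weights must be None, 'linear' or 'quadratic'")
    k = len(table)
    if k < 2 or any(len(row) != k for row in table):
        raise ValueError("table must be square with at least two categories")

    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(row[j] for row in table) for j in range(k)]

    def penalty(i, j):
        distance = abs(i - j) / (k - 1)
        if weights == "linear":
            return distance
        if weights == "quadratic":
            return distance ** 2
        return 0.0 if i == j else 1.0

    cells = [(i, j) for i in range(k) for j in range(k)]
    # Observed and chance-expected disagreement, each as a proportion.
    observed = sum(penalty(i, j) * table[i][j] for i, j in cells) / n
    expected = sum(penalty(i, j) * row_totals[i] * col_totals[j] for i, j in cells) / n**2
    if expected == 0:
        raise ValueError("kappa is undefined when all ratings fall in one category")
    return 1.0 - observed / expected


def interpret_kappa(kappa):
    """Label a kappa value using the bands described in the text."""
    if kappa < 0.2:
        return "slight"
    if kappa < 0.4:
        return "fair"
    if kappa < 0.6:
        return "moderate"
    if kappa < 0.8:
        return "substantial"
    return "excellent"  # assumption: exactly 0.8 is counted as excellent


# Hypothetical 3 x 3 table of counts, for illustration only.
example = [[12, 3, 0], [2, 15, 4], [0, 1, 9]]
value = cohen_kappa(example, weights="linear")
print(round(value, 2), interpret_kappa(value))
```

Writing kappa as one minus the ratio of observed to expected disagreement lets the same function cover the unweighted and weighted cases.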



RESULTS 

In the training set, 67 (73%) patients were male, and 25 (27%) were
female. The mean±SD age of patients in the training set was 38.5±12.0
years. In the validation set, 35 (61%) patients were male, and 22
(39%) were female. The mean±SD age of patients in the validation set
was 35.5±10.7 years. The biopsy specimens included a median number of
seven (range: 4-22) portal tracts.



Inter-observer variability in the training set

In general, agreement between the two pathologists was excellent. In
the Ishak system, the k statistic for inter-observer agreement was
0.90 for interface hepatitis, 0.92 for confluent necrosis, 0.80 for
focal necrosis, 0.87 for portal inflammation, and 0.86 for staging of
fibrosis. In the METAVIR system, the k for inter-observer agreement
was 0.81 for grading of necro-inflammation, and 0.82 for staging of
fibrosis.

Table 3: Comparison between METAVIR grading
system, and the proposed criteria for Ishak
grading categorization in the training set*

METAVIR                 Ishak grading
grading     Minimal    Mild    Moderate    Severe
A0               22       0           0         0
A1                0      33           4         0
A2                0       3          17         0
A3                0       0           0        13
Total            22      36          21        13

*k = 0.89

Correlation between the two systems in the 
training set 

Using the Ishak system, the mean±SD stage of liver fibrosis was
1.30±1.55 (range: 0-6). The mean±SD grade of necro-inflammation was
4.89±3.01 (range: 1-15) in the training set. Evaluation of the
different components of the Ishak grading system showed that the
mean±SD score of interface hepatitis was 1.18±1.10 (range: 0-4); the
mean±SD score of confluent necrosis was 0.63±1.03 (range: 0-5); the
mean±SD score of focal lytic necrosis-apoptosis was 1.52±0.54 (range:
0-3); and the mean±SD score of portal inflammation was 1.55±0.89
(range: 0-4).

We categorized the Ishak scores of the training set (Table 1), and
compared the groups with the METAVIR grading system. Table 3 shows a
comparison of the necro-inflammatory scores between the two systems
according to our proposed criteria. There was an excellent
correlation (k = 0.89) between the two systems using the proposed
criteria.

However, when we compared grading of METAVIR with the previously
suggested categories of the Ishak grading system [8], we found a much
weaker correlation between the two systems (k = 0.18). Using our
criteria, the correlation between staging of METAVIR and staging of
the Ishak system was excellent (k = 0.99) (Table 4).



Correlation between the two systems in the 
validation set 

Using the Ishak system, the mean±SD stage of liver fibrosis was
1.60±1.27 (range: 0-5). The mean±SD grade of necro-inflammation was
4.18±2.18 (range: 1-13) in the validation set. Evaluation of the
different components of the Ishak grading system showed that the
mean±SD score of interface hepatitis was 0.70±0.73 (range: 0-3); the
mean±SD score of confluent necrosis was 0.26±0.84 (range: 0-4); the
mean±SD score of focal lytic necrosis-apoptosis was 1.51±0.63 (range:
0-3); and the mean±SD score of portal inflammation was 1.70±0.65
(range: 1-3).

Table 5 shows a comparison of the necro-in- 
flammatory scores between the two systems 
according to our proposed criteria in the vali- 
dation set. There was substantial correlation 
(k = 0.61) between the two systems using the 
proposed criteria. 

We found that most of the discordances between our criteria and
METAVIR grades were in the minimal and mild necro-inflammation groups
(i.e., A0 and A1 in METAVIR). When we combined minimal and mild
necro-inflammation as one group in the Ishak system (i.e.,
minimal/mild vs moderate vs severe inflammation), and merged A0 and
A1 of METAVIR as one group (i.e., A0/A1 vs A2 vs A3), the correlation
between the two grading systems was perfect (k = 1.0) in the
validation set.
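The effect of merging the two lowest categories can be reproduced from the Table 5 counts. This is a sketch under stated assumptions: the helper names are hypothetical, and unweighted Cohen's kappa is used for simplicity, whereas the Methods describe weighted kappa without specifying the weights.

```python
# Table 5 counts: rows are METAVIR grades A0-A3, columns are the proposed
# Ishak categories (minimal, mild, moderate, severe).
TABLE_5 = [
    [2, 0, 0, 0],
    [9, 39, 0, 0],
    [0, 0, 6, 0],
    [0, 0, 0, 1],
]


def collapse(table, groups):
    """Merge rows and columns of a square table according to index groups."""
    return [
        [sum(table[i][j] for i in rows for j in cols) for cols in groups]
        for rows in groups
    ]


def unweighted_kappa(table):
    """Unweighted Cohen's kappa for a square table of counts."""
    k = len(table)
    n = sum(map(sum, table))
    observed = sum(table[i][i] for i in range(k)) / n
    expected = sum(
        sum(table[i]) * sum(row[i] for row in table) for i in range(k)
    ) / n**2
    return (observed - expected) / (1 - expected)


# Merge A0 with A1, and minimal with mild; keep the other groups separate.
merged = collapse(TABLE_5, groups=[[0, 1], [2], [3]])
print(merged)                               # [[50, 0, 0], [0, 6, 0], [0, 0, 1]]
print(round(unweighted_kappa(TABLE_5), 2))  # 0.61
print(unweighted_kappa(merged))             # 1.0
```

After merging, every specimen lies on the diagonal, so kappa is 1.0 under any weighting scheme.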

Then, we analyzed the correlation between the two grading systems
using the previously suggested categories [8] for the Ishak system.
With the old categorization, the correlation between the Ishak and
METAVIR grading systems was slight (k = 0.19).

Using our criteria, the correlation between 
staging of METAVIR and staging of Ishak 
system was excellent (k = 0.94) in the valida- 
tion set (Table 6). 

Table 4: Comparison between scores for liver
fibrosis obtained by METAVIR and Ishak scoring
systems in the training set*

METAVIR               Ishak system
system       0    1 or 2    3    4 or 5    6
F0          35         0    0         0    0
F1           1        37    0         0    0
F2           0         0    8         0    0
F3           0         0    0        10    0
F4           0         0    0         0    1
Total       36        37    8        10    1

*k = 0.99

Sources of discrepancy in grading of necro-inflammation

To find out the sources of discrepancies, we carefully analyzed the
scores given by the pathologists in the training set, and the scores
given by the pathologist in the validation set. As expected, the
discrepancies were mostly in the minimal/mild necro-inflammation
groups.



The following finding was the main source of discrepancy. In the
training set, the slides with the Ishak score of a0, b0, c1, and d1
were scored in the METAVIR system as lobular necrosis: 0, portal
inflammation: 1, piecemeal necrosis: 0, and bridging necrosis: 0.
This corresponds to A0 in the METAVIR system. However, in the
validation set, the slides with the same Ishak score (i.e., a0, b0,
c1, and d1) were scored in the METAVIR system as lobular necrosis: 1,
portal inflammation: 1, piecemeal necrosis: 0, and bridging necrosis:
0. This corresponds to A1 in the METAVIR system [3].



DISCUSSION 

The aim of various histological scoring sys- 
tems for chronic hepatitis is that the same def- 
inition of activity be used by all pathologists 
fjjf]. The different scoring rules of each system 
limit our ability to predictably convert scores 
between them. In this study, we propose cri- 
teria which allow more direct comparison of 
Ishak and METAVIR scores. 

Using our proposed categorization for the Ishak system, correlation
of the grading between the two systems was excellent (k = 0.89) in
the training set and substantial (k = 0.61) in the validation set.

We found that most of the discrepancies observed between our
suggested categorization and the validation set were attributable to
the discrimination of minimal from mild necro-inflammation. When we
merged minimal and mild necro-inflammation as one group, the
correlation of the suggested categorization (Table 1) and the
validation set was perfect (k = 1.0). Therefore, our suggested
criteria were particularly accurate for discriminating minimal/mild
vs moderate vs severe necro-inflammation.

We found that components "a" (i.e., interface hepatitis) and "b"
(i.e., confluent necrosis) of the Ishak system play an important role
in the correlation with the METAVIR grading system. Furthermore,
consistent with the findings of the METAVIR cooperative study group,
interface hepatitis (piecemeal necrosis) and lobular necrosis are
more important for the grading of the METAVIR system [3].

In this study, we compared the two grading systems based on
categorical rather than numerical data. In an earlier comparison
between the two systems, the total grading of the Ishak system was
compared with the METAVIR grading system [8]. However, since each
component of the Ishak system differs from the other components in
terms of scale and importance, individual components of the Ishak
grading system should be individually analyzed [9]. Indeed, we found
a poor correlation between the two grading systems (k = 0.18) when we
compared the two systems according to the numerical data.

Table 5: Comparison between METAVIR grading
system, and the proposed criteria for Ishak
grading categorization in the validation set*

METAVIR                 Ishak grading
grading     Minimal    Mild    Moderate    Severe
A0                2       0           0         0
A1                9      39           0         0
A2                0       0           6         0
A3                0       0           0         1
Total            11      39           6         1

*k = 0.61

Table 6: Comparison between scores for staging
of liver fibrosis obtained by METAVIR and Ishak
staging systems in the validation set*

METAVIR               Ishak system
system       0    1 or 2    3    4 or 5    6
F0           9         0    0         0    0
F1           1        35    0         0    0
F2           0         1    7         0    0
F3           0         0    0         4    0
F4           0         0    0         0    0
Total       10        36    7         4    0

*k = 0.94

Stages 1 and 2 of the Ishak system represent mild fibrosis without
bridging. These scores were compatible with F1 of the METAVIR system.
Stages 4 and 5 of the Ishak system represent advanced bridging
fibrosis and/or the beginning of nodule formation. These scores
correspond to F3 in the METAVIR system. According to our proposed
categorization (Table 2), correlation between the two systems was
almost perfect (k of 0.99 in the training set, and 0.94 in the
validation set).

We also determined inter-observer variability of each scoring system
in the training set. We found excellent agreement between the two
pathologists in both the Ishak and METAVIR scoring systems. In
another study of inter-observer variability for the Ishak system, a
moderate to good agreement was reported [10]. The excellent
inter-observer agreement found in our study may be explained by the
high level of expertise of the two pathologists. Furthermore, the
percentage of agreement is dependent on the number of observers. In
conclusion, we found that either scoring system could be applied for
grading and staging of chronic liver diseases. Categorization of
Ishak scores allowed accurate translation to the corresponding scores
of the METAVIR system.